TITLE: Library analysis of SCHEMA-guided protein recombination

Meyer, Michelle M.; Silberg, Jonathan J.; Voigt,
Christopher A.; Endelman, Jeffrey B.; Mayo,
Stephen L.; Wang, Zhen-Gang;
Arnold, Frances H.
Protein Science (2003), 12(8), 1686-1693
CODEN: PRCIEI; ISSN: 0961-8368 
Cold Spring Harbor Laboratory Press 
Journal 
English 

The computational algorithm SCHEMA was developed to est. the disruption
caused when amino acid residues that interact in the three-dimensional
structure of a protein are inherited from different parents upon
recombination. To evaluate how well SCHEMA predicts disruption, we have
shuffled the distantly-related .beta.-lactamases PSE-4 and TEM-1 at 13
sites to create a library of 2^14 (16,384) chimeras and examd. which ones
retain lactamase function. Sequencing the genes from ampicillin-selected
clones revealed that the percentage of functional clones decreased
exponentially with increasing calcd. disruption (E = the no. of
residue-residue contacts that are broken upon recombination). We also
found that chimeras with low E have a higher probability of maintaining
lactamase function than chimeras with the same effective level of mutation
but chosen at random from the library. Thus, the simple distance metric
used by SCHEMA to identify interactions and compute E allows one to
predict which chimera sequences are most likely to retain their function.
This approach can be used to evaluate crossover sites for recombination
and to create highly mosaic, folded chimeras.
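
The disruption count E described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `schema_disruption`, the block encoding, and the toy sequences are hypothetical. The rule that a contact is left unbroken when the same residue pair already occurs in a parent is an assumption taken from general descriptions of SCHEMA, not from this record.

```python
from typing import List, Sequence, Tuple


def schema_disruption(
    contacts: Sequence[Tuple[int, int]],
    parents: Sequence[str],
    block_of: Sequence[int],
    chimera_blocks: Sequence[int],
) -> int:
    """Count residue-residue contacts broken in one chimera.

    contacts       -- (i, j) index pairs of residues that interact in the structure
    parents        -- aligned parent sequences of equal length
    block_of       -- block_of[i] is the index of the block containing residue i
    chimera_blocks -- chimera_blocks[b] is the index of the parent supplying block b
    """
    broken = 0
    for i, j in contacts:
        pi = chimera_blocks[block_of[i]]
        pj = chimera_blocks[block_of[j]]
        if pi == pj:
            continue  # both residues inherited from the same parent
        pair = (parents[pi][i], parents[pj][j])
        # Assumption: a pair that already occurs in some parent is not disruptive.
        if any((p[i], p[j]) == pair for p in parents):
            continue
        broken += 1
    return broken


# Toy data (hypothetical): two parents, two blocks, three contacts.
toy_parents: List[str] = ["ACDE", "AGHK"]
toy_block_of = [0, 0, 1, 1]
toy_contacts = [(0, 2), (1, 3), (0, 1)]

# 13 crossover sites give 14 blocks, each taken from one of two parents.
library_size = 2 ** 14
```

With the toy data, the chimera that takes block 0 from the first parent and block 1 from the second has one broken contact (residues 1 and 3), while a pure parent has none.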



REFERENCE COUNT: 27 THERE ARE 27 CITED REFERENCES AVAILABLE FOR THIS
RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT



L101 ANSWER 2 OF 9 CAPLUS COPYRIGHT 2004 ACS on STN DUPLICATE 2
ACCESSION NUMBER: 2002:478068 CAPLUS
DOCUMENT NUMBER: 137:258440
TITLE: Protein building blocks preserved by recombination
AUTHOR(S): Voigt, Christopher A.; Martinez, Carlos;
Biochemistry and Molecular Biophysics, California
Institute of Technology, Pasadena, CA, 91125, USA 
Nature Structural Biology (2002), 9(7), 553-558 
CODEN: NSBIEW; ISSN: 1072-8368
Nature Publishing Group 
Journal 
English 

Borrowing concepts from the schema theory of genetic algorithms, we have 
developed a computational algorithm to identify the fragments of proteins, 
or schemas, that can be recombined without disturbing the integrity of the 
three-dimensional structure. When recombination leaves these schemas 
undisturbed, the hybrid proteins are more likely to be folded and 
functional. Crossovers found by screening libraries of several randomly 
shuffled proteins for functional hybrids strongly correlate with those 
predicted by this approach. Exptl. results from the construction of 
hybrids of two .beta.-lactamases that share 40% amino acid identity
demonstrate a threshold in the amt. of schema disruption that the hybrid
protein can tolerate. To the extent that introns function to promote 
recombination within proteins, natural selection would serve to bias their 
locations to schema boundaries. 
AB The invention relates to improved methods for directed evolution of
polymers, including directed evolution of nucleic acids and proteins.
Specifically, the methods of the invention include anal. methods for
identifying "crossover locations" in a polymer. Crossovers at these
locations are less likely to disrupt desirable properties of the protein,
such as stability or functionality. The invention further provides
improved methods for directed evolution wherein the polymer is selectively
recombined at the identified "crossover locations". Crossover disruption
profiles can be used to identify preferred crossover locations.
Structural domains of a biopolymer can also be identified and analyzed,
and domains can be organized into schema. Schema disruption profiles can
be calcd., for example based on conformational energy or interat.
distances, and these can be used to identify preferred or candidate
crossover locations. Computer systems for implementing anal. methods of
the invention are also provided. Examples of the invention include
computational calcns. of regions of .beta.-lactamase in which
crossovers/in vitro recombination would disrupt protein structure, calcns.
of a probability distribution for disruption of protein (sub)structures of
computationally-generated recombinant mutants, and comparison of a
predicted protein disruption profile with exptl. obsd. recombination
crossover points.
AB The invention relates to improved methods for directed evolution of
polymers, including directed evolution of nucleic acids and proteins.
Specifically, the methods of the invention include anal. methods for
identifying "structurally tolerant" residues of a polymer. Mutations of
these structurally tolerant residues are less likely to adversely affect
desirable properties of a polymer sequence. The invention further
provides improved methods for directed evolution wherein the structurally
tolerant residues of a polymer are selectively mutated. Computer systems
for implementing anal. methods of the invention are also provided.
We introduce a computational method to optimize the in vitro evolution of
proteins. Simulating evolution with a simple model that statistically
describes the fitness landscape, we find that beneficial mutations tend to
occur at amino acid positions that are tolerant to substitutions, in the
limit of small libraries and low mutation rates. We transform this
observation into a design strategy by applying mean-field theory to a
structure-based computational model to calc. each residue's structural
tolerance. Thermostabilizing and activity-increasing mutations
accumulated during the exptl. directed evolution of subtilisin E and T4
lysozyme are strongly directed to sites identified by using this
computational approach. This method can be used to predict positions
where mutations are likely to lead to improvement of specific protein
properties.
Directed evolution has proven to be a successful strategy for the
modification of enzyme properties. To date, the preferred exptl. 
procedure has been to apply mutations or crossovers randomly throughout 
the gene. With the emergence of powerful computational methods, it has 
become possible to develop focused combinatorial searches, guided by 
computer algorithms. Here, we describe several computational methods that 
have emerged to aid the optimization of mutant libraries, the targeting of 
specific residues for mutagenesis, and the design of recombination expts. 
REFERENCE COUNT: 31 THERE ARE 31 CITED REFERENCES AVAILABLE FOR THIS 
AB The invention relates to improved methods for directed evolution of
polymers, including directed evolution of nucleic acids and proteins.
Specifically, the methods of the invention include anal. methods for
identifying "crossover locations" in a polymer. Crossovers are sequences
sepg. structurally or functionally important domains and changes at these
locations are less likely to disrupt desirable properties of the protein,
such as stability or functionality. The invention further provides
improved methods for directed evolution wherein the polymer is selectively
recombined at the identified "crossover locations". Crossover disruption
profiles can be used to identify preferred crossover locations.
Structural domains of a biopolymer can also be identified and analyzed,
and domains can be organized into schema. Schema disruption profiles can
be calcd., for example based on conformational energy or interat.
distances, and these can be used to identify preferred or candidate
crossover locations. Computer systems for implementing anal. methods of
the invention are also provided.
NOVELTY - Residues of a particular polymer sequence are selected for
mutation, by obtaining a level of structural tolerance for residues of the 
particular polymer sequence; and selecting structurally tolerant residues 
for mutation. 

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for: 

(1) a computer system for analyzing a polymer sequence, comprising a
memory; and a processor interconnected with the memory and having software
component(s) causing the processor to execute selection of residues of
particular polymer sequence for mutation;

(2) a computer program product comprising a computer readable medium
having software component(s) encoded in computer readable form; and

(3) a method for directed evolution of a polymer, comprising
providing a parent polymer sequence, which has properties of interest;
selecting structurally tolerant residue(s) of the parent polymer sequence
for mutation; generating from the parent polymer sequence, mutant polymer
sequence(s) in which the selected residue(s) is mutated; and screening the
mutant sequence(s) for properties of interest.

USE - The invention is for selection of residues of particular 
polymer sequence comprising a sequence of amino acid residues and 
nucleotide residues, for mutation or for directed evolution of a polymer. 
The polymer comprises a polypeptide, which is one of TEM-1 and PSE-4, 
where the amino acid residue(s) at positions 39, 90, 99, 140, 158, 198,
and 227 is substituted. The polypeptide comprises a substitution(s)
consisting of Q-R and Q-N at residue 39, Q-S at residue 90, Q-R at residue
99, T-K, T-A, and T-N at residue 140, H-Y at residue 158, L-I at residue
198, and A-D at residue 227 (all claimed).

ADVANTAGE - The invention eliminates or reduces the random mutagenesis
of known methods, and provides a more targeted approach with improved
efficiency. It is straightforward and is computationally tractable.

DESCRIPTION OF DRAWING(S) - The figure is a flow diagram illustrating
a selection of residues of particular polymer sequence for mutation. 
Dwg.1/10 
NOVELTY - Selecting a crossover location in a first biopolymer (A), for
recombination with second biopolymers comprises:

(a) identifying coupling interaction between pairs of residues in (A);

(b) generating data structures; 

(c) determining a crossover disruption related to the number of 
coupling interactions disrupted in the crossover mutant represented by the 
data structure; and 

(d) identifying a data structure having a crossover disruption below 
a threshold. 

DETAILED DESCRIPTION - Selecting (M1) a crossover location in a first
biopolymer having a first polymer sequence, for recombination with one or
more second biopolymers, each having its own second polymer sequence
comprises:

(a) identifying coupling interaction between pairs of residues in the
first polymer sequence;

(b) generating data structures, each data structure representing a
crossover mutant comprising a recombination of the first and a second 
polymer sequence where each recombination has a different crossover 
location; 

(c) determining, for each data structure, a crossover disruption 
related to the number of coupling interactions disrupted in the crossover 
mutant represented by the data structure; and 

(d) identifying, among the data structures, a particular data 
structure having a crossover disruption below a threshold, where the 
crossover location of the crossover mutant represented by the particular 
data structure is the identified crossover location. 
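
Steps (a) to (d) can be illustrated with a short Python sketch for a single crossover. It is a simplified illustration under stated assumptions, not the claimed method: coupled residue pairs are supplied directly as input, a coupling counts as disrupted when the crossover falls between its two residues, and the function names and the example threshold are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def crossover_disruptions(
    length: int, couplings: Sequence[Tuple[int, int]]
) -> Dict[int, int]:
    """Map each crossover location k (1..length-1) to its disruption.

    A crossover at k takes residues 0..k-1 from the first sequence and
    residues k..length-1 from the second. Assumption: a coupling (i, j) is
    disrupted whenever its two residues fall on opposite sides of k.
    """
    return {
        k: sum(1 for i, j in couplings if min(i, j) < k <= max(i, j))
        for k in range(1, length)
    }


def locations_below_threshold(
    length: int, couplings: Sequence[Tuple[int, int]], threshold: int
) -> List[int]:
    """Return the crossover locations whose disruption is below the threshold."""
    profile = crossover_disruptions(length, couplings)
    return [k for k, disruption in profile.items() if disruption < threshold]


# Toy example (hypothetical): six residues forming two tightly coupled clusters.
toy_couplings = [(0, 1), (1, 2), (0, 2), (3, 5), (4, 5)]
toy_profile = crossover_disruptions(6, toy_couplings)
```

In the toy example the only location with zero disruption is the boundary between the two clusters, which is the kind of site the threshold in step (d) is meant to pick out.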

INDEPENDENT CLAIMS are also included for the following: 

(1) a computer system for analyzing a polymer sequence; 

(2) directed evolution of a polymer; 

(3) a computer program comprising a computer readable medium having 
one or more software components encoded in computer readable form, where 
the one or more software components may be loaded into a memory of a 
computer system; 

(4) producing hybrid polymers from two or more parent polymers; 

(5) producing a library of hybrid polymers; 

(6) modelling the recombination of two or more parent polymers; 

(7) producing recombinant oligonucleotides from two or more parent 
oligonucleotides by a staggered extension process; 

(8) producing recombinant oligonucleotides from two or more parent 
oligonucleotides by an in vitro-in vivo recombination method; 

(9) producing recombinant oligonucleotides from two or more parent 
oligonucleotides by a PCR amplification method; 

(10) producing recombinant oligonucleotides from two or more parent 
oligonucleotides by a family shuffling method; 

(11) a beta-lactamase hybrid comprising the amino acid sequence of 
PSE-4, substituted in part by an amino acid sequence of TEM-1, where the 
substitution comprises amino acid residues 164-179, 190-216, 71-216, 
71-130, or 254 and higher of PSE-4 are replaced by the corresponding amino 
acid residues of TEM-1; and 

(12) a hybrid polymer comprising a first polypeptide recombined with 
at least a second polypeptide at one or more crossover locations selected 
according to a schema disruption threshold. 

USE - The method is useful for selecting a crossover location in a 
first biopolymer having a first polymer sequence, for recombination with 
one or more second biopolymers, each having its own second polymer 
sequence. The methods are also useful for producing hybrid polymers from 
two or more parent polymers, producing a library of hybrid polymers,
modelling the recombination of two or more parent polymers, producing
recombinant oligonucleotides from two or more parent oligonucleotides by a
staggered extension process, an in vitro-in vivo recombination method, by
a PCR amplification method, or by a family shuffling method (claimed).
US 2003198988 A1 20031023 US 2003-386903 20030310

PRIORITY APPLN. INFO.: US 2002-363505P P 20020309
US 2002-373591P P 20020418
AB Methods and devices for more efficiently engineering diversity into
recombinant polypeptides and/or nucleic acids are provided herein. For
example, a variety of methods of selecting and/or assessing potential 
crossover sites in an amino acid sequence or a nucleotide sequence are 
provided, as well as the resulting chimeric product sequences. These 
methods include, e.g., consideration of structural, functional and/or 
statistical data in the selection and assessment of sequences and 
crossover sites for use in recombination. Wild type genes for muconate 
lactonizing enzyme, MLE 1 and 2 from Pseudomonas putida and MLE 1 gene 
from Acetobacter calcoaceticus were used in examples for engineering
enzymes with composite active sites. 
AB A modeling framework for predicting the no., type and distribution of
crossovers in directed evolution expts. is disclosed. The framework
provides for detg. how fragmentation length, annealing temp., sequence 
identity, and no. of shuffled parent sequences affect the no., type, and 
distribution of crossovers along the length of reassembled sequences. 
This framework allows for the optimization of directed evolution protocols 
in response to a particular enzyme or protein design challenge. One 
method according to the present invention includes applying equil. 
thermodn. to a plurality of sequences to det. statistics of hybridization; 
and parameterizing an assembly algorithm using the statistics of 
hybridization. According to the framework of the present invention, the 
annealing events during reassembly are modeled as a network of reactions, 
and equil. thermodn. is used to quantify their conversions and 
selectivities. The key idea of the reassembly algorithm is to postulate a
set of recursive relations that describe the probability that a 
full-length reassembled sequence involves a given no. of crossovers. An 
in silico case study of a set of 12 subtilases examines the effect of 
fragmentation length, annealing temp., sequence identity and no. of 
shuffled sequences on the no., type, and distribution of crossovers. A 
computational verification of crossover aggregation in regions of
near-perfect sequence identity and the presence of synergistic reassembly 
in family DNA shuffling is obtained. 
General Method for Sequence-independent
Site-directed Chimeragenesis
91125, USA
Elsevier Science Ltd.
simple and general method that allows for the facile
recombination of distantly related (or unrelated) proteins at multiple
discrete sites. To evaluate the sequence-independent site-directed
chimeragenesis (SISDC) method, we have recombined .beta.-lactamases TEM-1
and PSE-4 at seven sites, examd. the quality of the chimeric genes
created, and screened the library of 2^8 (256) chimeras for functional
enzymes. Probe hybridization and sequencing analyses revealed that SISDC
generated a random library with little sequence bias and in which all
targeted fragments were recombined in the desired order. Sequencing the
genes from clones having functional lactamases identified 14 unique
chimeras. These chimeras are characterized by a lower level of
disruption, as calcd. by the SCHEMA algorithm, than the library as a
whole. These results illustrate the use of SISDC in creating designed
chimeric protein libraries and further illustrate the ability of SCHEMA to
identify chimeras whose folded structures are likely not to be disrupted
by recombination.
mutation

Wang, Ning; Akey, Joshua M.; Zhang, Kun; Chakraborty,
Ranajit; Jin, Li 

Center for Genome Information, University of 
Cincinnati, Cincinnati, OH, 45267-0056, USA 
American Journal of Human Genetics (2002), 71(5), 
1227-1234 

CODEN: AJHGAG; ISSN: 0002-9297 
University of Chicago Press 
Journal 
English 

Recent studies suggest that haplotypes are arranged into discrete 
blocklike structures throughout the human genome. Here, we present an 
alternative haplotype block definition that assumes no recombination 
within each block but allows for recombination between blocks, and we use 
it to study the combined effects of demog. history and various population 
genetic parameters on haplotype block characteristics. Through extensive 
coalescent simulations and anal. of published haplotype data on chromosome
21, we find that (1) the combined effects of population demog. history, 
recombination, and mutation dictate haplotype block characteristics and 
(2) haplotype blocks can arise in the absence of recombination hot spots. 
Finally, we provide practical guidelines for designing and interpreting 
studies investigating haplotype block structure. 
REFERENCE COUNT: 19 THERE ARE 19 CITED REFERENCES AVAILABLE FOR THIS
RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT
Compound heterozygosity for Hb S and the
hybrid Hbs Lepore, P-Nilotic, and Kenya;
comparison of hematological and hemoglobin composition
data
Huisman, T.H.J.
Medical College of Georgia, Augusta, GA, 30912-2114,
USA
DOCUMENT TYPE: Journal; General Review
AB A review with 31 refs. on the anal. of hematol. and Hb compn. data for
patients with a heterozygosity for Hb Lepore, Hb P-Nilotic, and Hb Kenya,
with and without a Hb S heterozygosity. The locations of the areas of
crossover between the .delta.- and .beta.-globin genes leading to the
formation of the .delta..beta. genes of Hb Lepore anomalies and the
.beta..delta. gene of the Hb P-Nilotic abnormality are discussed.

REFERENCE COUNT: 31 THERE ARE 31 CITED REFERENCES AVAILABLE FOR THIS
RECORD. ALL CITATIONS AVAILABLE IN THE RE FORMAT
Construction of phylogenetic trees from amino acid
sequences using a genetic algorithm
Matsuda, Hideo

Dep. of Information and Computer Sciences, Osaka 
Univ., Osaka, 560, Japan 

Genome Informatics Series (1995), 6 (Genome Informatics
Workshop 1995), 19-28
CODEN: GINSE9; ISSN: 0919-9454
Universal Academy Press 
Journal 
English 

We have developed a novel algorithm to search for the max. 
constructed from amino acid sequences. This algorithm is 
genetic algorithms which uses scores derived from the log-likelihood of 
trees computed by the max. likelihood method. This algorithm is valuable
since it may construct a more likely tree from randomly generated trees by
utilizing crossover and mutation operators. In a test of our algorithm on
a data set of elongation factor-1 .alpha. sequences, we found that the
performance of our algorithm is comparable to that of other
tree-construction methods (UPGMA, the neighbor-joining and the max.
parsimony methods; and the max. likelihood method with different search
algorithms).
TITLE: Gene disruption in Lactobacillus plantarum
strain 80 by site-specific recombination: Isolation of
a mutant strain deficient in conjugated bile
salt hydrolase activity

Leer, R. J.; Christiaens, H.; Verstraete, W.; Peters,
L.; Posno, M.; Pouwels, P. H.

Med. Biol. Lab., TNO, Rijswijk, 5815 HV, Neth. 
Molecular and General Genetics (1993), 239(1-2), 
269-72 

CODEN: MGGEAE; ISSN: 0026-8925 
Journal 
English 

A chloramphenicol-resistance gene (cml) was introduced into the 
Lactobacillus plantarum gene encoding conjugated bile acid hydrolase (cbh) 
on a ColE1 replicon. This plasmid, which is nonreplicative in
Lactobacillus, was used to transform L. plantarum strain 80. A homologous
double cross-over recombination event resulted in replacement of the
chromosomal cbh gene by the cml-contg. cbh gene. The transformants
obtained were unable to synthesize active conjugated bile acid hydrolase
(Cbh). The Cbh- CmlR phenotype was stably maintained for more than 100
generations under nonselective conditions. 
113:53523

Identification of the crossing-over point of a 
hybrid gene encoding human glycophorin variant 
Sta. Similarity to the crossing-over point in 
haptoglobin-related genes 

Rearden, Ann; Phan Huan; Dubnicoff, Todd; Kudo, 
Shinichi; Fukuda, Minoru 
Dep. Pathol., Univ. California,
CA, 92093, USA 

Journal of Biological Chemistry 
925-9-63 

CODEN: JBCHA3; ISSN: 0021-9258 
Journal 
English 

One of the human glycophorin variants, Stones (Sta), has been shown to be
the product of a hybrid gene of which the 5'-half is derived from the
glycophorin B (GPB) gene and the 3'-half is derived from the glycophorin A
(GPA) gene. The present study reveals the crossing-over point of this
hybrid gene from the anal. of polymerase chain reaction products. The
genomic sequences encompassing the region corresponding to exon 3 to exon
4 of GPA were amplified by polymerase chain reaction with oligonucleotide
primers synthesized according to GPA and GPB genomic sequences. After
subcloning the products, the nucleotide sequences derived from GPA, GPB,
and putative Sta genes were detd. Comparison of the nucleotide sequences
of GPA, GPB, and Sta genes indicates that the crossing-over took place 200
bp upstream from the first nucleotide of exon 4. Intriguingly, the
nucleotide sequence surrounding the putative crossing-over point is
homologous to the crossing-over point proposed for x haptoglobin genes
(Maeda, N., et al., 1986). These results suggest strongly that homologous
recombination through unequal crossing-over can be facilitated by specific 
genomic elements, such as those in common in these 2 crossing-over events. 
The present study also revealed that this Sta individual has a variant GPA 
gene; substitution of adenine for guanine at the nucleotide for codon 39 
results in substitution of lysine for arginine at amino acid 39, and loss 
of an SstI restriction site. 
Chemical Physics (1 Aug. 2002) vol.281, no.2-3,
p. 393-408. 65 refs.
Doc. No.: S0301-0104(02)00495-0
Published by: Elsevier
Price: CCCC 0301-0104/02/$22.00
Journal
We present a kinetic-quantum model for the mechanisms of hole transport in
DNA duplexes, which involves a sequence of hole hopping
processes between adjacent guanines (G) and/or hole hopping/trapping via
GG or GGG, all of which are separated by thymine (T)-adenine (A) bridges.
Individual hole hopping processes between G sites fall into two distinct
parallel mechanisms, i.e., unistep superexchange mediated hopping via
'short' (T-A)n bridges and thermally induced hopping (TIH) via 'long'
(T-A)n (n>3-4) bridges. The bridge specificity for TIH via (A)n chains
pertains to the energetics, with the G+A energy gap Delta=0.20+or-0.05 eV
being sufficiently low to warrant endothermic hole excitation from G+ to
(A)n, and to the electronic couplings, with the nearest-neighbor
A-A couplings being unique in the sense that the intrastrand and
interstrand couplings are close and large (V(A-A)
approximately=0.30-0.060 eV). Accordingly, both effective intrastrand and
interstrand (zigzagging) hole transport via (A)n chains will prevail,
being nearly invariant with respect to the nucleobases ordering within the
(T-A)n duplex. We treated the 'transition' between the superexchange and
the TIH mechanism in 5'-G+(T-A)nG-3' duplexes to predict that the
crossover occurs at nx approximately=3-4, with nx exhibiting a
moderate bridge specificity and energy gap dependence. nx is in accord
with the experimental data of Giese et al. [Nature 412, 318, 2001]. We
assert that the kinetic-quantum mechanical model for the chemical yields
and elementary rates cannot be reconciled with the experimental TIH data,
with respect to the very weak bridge size dependence of the relative
chemical yields and the ratios of the rates. Configurational relaxation
accompanying endothermic hole injection from G+ to (A)n may result in the
gating (switching-off) of the backrecombination, providing a reasonable
description of TIH dynamics and very long-range hole transport in long
(A)n chains.
SOURCE: IEEE Transactions on Evolutionary Computation (June
2002) vol.6, no.3, p. 306-13. 30 refs.
Doc. No.: S1089-778X(02)06068-X
Published by: IEEE
Price: CCCC 1089-778X/02/$17.00
CODEN: ITEVF5 ISSN: 1089-778X
SICI: 1089-778X(200206)6:3L.306:ICC;1-6
DOCUMENT TYPE: Journal 
TREATMENT CODE: Theoretical 
COUNTRY: United States 

LANGUAGE: English 

AB Among the different mechanisms employed by evolutionary algorithms,
it can be argued that recombination, or crossover,
is the most original, intuitively appealing and useful in an engineering
perspective. It is a simple, but natural trick to combine elements of two
good individuals in the hopes of generating a better one and, in
particular, by combining the elements that make these solutions good in
isolation. The trick of recombination can be seen not only in
genetic systems, but also in immune and chemical systems. This
paper describes and explains these latter recombination
mechanisms, first from a biological or chemical perspective, then from an
engineering perspective. With regard to crossover in immune
systems, several algorithmic mechanisms have already been proposed (e.g. 
IRM, GA-Simplex, STEP) and these are reviewed. Their basic functionality 
in each case is the same: new individuals are created in a zone of the 
search space that is shaped by the position of the current solutions, 
together with their fitness values. When the immune system proposes a new 
cell, the profile of this new candidate evidences a huge diversity, 
providing its adaptive capability, but this is subject to a subsequent 
"recruitment test" under the selective pressure of the current population 
of cells. With regard to crossover in chemical reactions, these 
can be viewed as a combination of computational graphs coupled
with the distribution of the fitness values assigned to components in the
graphs, as is already evidenced in particular instances of genetic
algorithms and genetic programming. The benefits that these new
features allow are discussed, along with other possible positive
influences that come from chemistry. Finally, the paper shows how
chemistry and immunology converge to this same basic message, which is in
line with classical optimization techniques: exploit the information
contained in the current population of solutions better before proposing a
new candidate to be evaluated.
Confinement, dimensional crossover and
topological coupling in quasi one
dimensional electronic systems.
Brazovskii, S. (LPTMS, Univ. de
France)

Synthetic Metals (15 March 2001)
p. 691-4. 12 refs.
Doc. No.: S0379-6779(00)01128-0
Published by: Elsevier
Price: CCCC 0379-6779/2001/$20.
CODEN: SYMEDZ ISSN: 0379-6779
SICI: 0379-6779(20010315)120:1/3L.691:CDCT;1-N
Conference: International Conference on Science and 
Technology of Synthetic Metals. Gastein, Austria, 
15-21 July 2000 
Conference Article; Journal 
Theoretical 
Switzerland 
English 

Topologically nontrivial states are common in symmetry broken phases at 
macroscopic scales. Low dimensional systems bring them to a microscopic 
level where solitons emerge as single particles. The examples are 
conducting polymers and spin-Peierls chains. We shall discuss 
topological aspects of elementary excitations, especially the confinement 
and the dimensional D crossover. At D>1 the topological
requirements for the combined symmetry originate the spin- or charge-roton
like excitations with charge- or spin-kinks localized in the core. In a
quasi 1D world they can be viewed as resulting from a spin-charge
recombination due to the 2D or 3D confinement. 
Conventional and multirecombinative evolutionary
algorithms for the parallel task scheduling
problem.

Esquivel, S.; Gatica, C.; Gallard, R. (Laboratorio de
Investigacion y Desarrollo en Inteligencia 
Computacional, Univ. Nacional de San Luis, Argentina) 
Applications of Evolutionary Computing. EvoWorkshops 
AB This paper deals with the problem of allocating a number of non identical
tasks in a parallel system. The model assumes that the system consists of
a number of identical processors and that only one task may be executed on
a processor at a time. All schedules and tasks are nonpreemptive. R.L.
Graham's (1972) well-known list scheduling algorithm (LSA) is
contrasted with different evolutionary algorithms (EAs), which
differ on the representations and the recombinative approach used. 
Regarding representation, direct and indirect representation of schedules 
are used. Concerning recombination, the conventional single 
crossover per couple (SCPC) and a multiple 
crossover per couple (MCPC) are used. Outstanding 
behaviour of evolutionary algorithms when contrasted against LSA 
was detected. Results are shown and discussed. 
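
The list scheduling baseline named in the abstract above can be sketched in a few lines. This is a minimal illustration only, assuming independent tasks with known durations (the abstract does not say whether precedence constraints are modelled); the names `list_schedule`, `durations` and `num_processors` are hypothetical and do not come from the paper.

```python
import heapq

def list_schedule(durations, num_processors):
    """Greedy list scheduling on identical processors.

    Tasks are taken in list order and each one is given to the processor
    that becomes free first. Assumes independent, nonpreemptive tasks.
    Returns (assignment, makespan).
    """
    # Heap of (time at which the processor becomes free, processor id).
    free_at = [(0, p) for p in range(num_processors)]
    heapq.heapify(free_at)
    assignment = []
    for d in durations:
        t, p = heapq.heappop(free_at)      # earliest-free processor
        assignment.append(p)
        heapq.heappush(free_at, (t + d, p))
    makespan = max(t for t, _ in free_at)  # finishing time of the last task
    return assignment, makespan
```

An evolutionary algorithm of the kind the abstract contrasts with LSA would search over task orderings or direct task-to-processor assignments and could use the makespan returned here as its objective.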
Multiple crossovers between multiple parents to
improve search in evolutionary algorithms. 
Esquivel, S.C. (Dept. de Inf., Univ. Nacional de San 
Luis, Argentina); Leiva, H.A.; Gallard, R.H. 
Proceedings of the 1999 Congress on Evolutionary 
Computation-CEC99 (Cat. No. 99TH8406) 

Piscataway, NJ, USA: IEEE, 1999. p. 1589-94 Vol. 2 of 3 
vol. (xxxvii+2348) pp. 19 refs. 
Conference: Washington, DC, USA, 6-9 July 1999 
Price: CCCC 0 7803 5536 9/99/$10.00 
ISBN: 0-7803-5536-9 
Conference Article 
Theoretical 
United States 
English 

As a new promising crossover method, multiple crossovers per 
couple (MCPC) deserves special attention in the evolutionary 
computing field. Allowing multiple crossovers per couple on a 
selected pair of parents provided an extra benefit in processing time and 
similar quality of solutions when contrasted against the conventional 
approach, which applies a single crossover operation per 
couple. These results were confirmed when optimising classic 
testing functions and harder (non-linear, non-separable) functions. 
Despite these benefits, due to a reinforcement of selective pressure, MCPC 
showed in some cases an undesirable premature convergence effect. An 
adequate balance between exploitation and exploration can improve search. 
Extreme exploitation can lead to premature convergence and intense 
exploration can make the search ineffective. Focussing on this equilibrium 
problem, a previous proposal combined MCPC with an alternative selection 
method: fitness proportional couple selection (FPCS), which first
creates an intermediate population of couples where both
individuals were chosen by proportional selection. Then a criterion is
applied to establish the fitness of a couple and subsequently,
couples are selected for crossing-over based
on couple fitness. This paper investigates the raw effect in
performance on a pair of selected optimization problems by using a new 
multiple crossovers on multiple parents (MCMP) method, which allows 
multiple recombination of multiple parents under uniform 
scanning crossover. 
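
A minimal sketch of the idea behind multiple crossovers on multiple parents, assuming that uniform scanning means each gene of a child is copied from a parent picked uniformly at random among all parents. The function names and the list-of-lists representation are illustrative assumptions, not the authors' implementation.

```python
import random

def uniform_scan_crossover(parents, rng=random):
    """Build one child: each gene position is copied from a parent chosen
    uniformly at random among all parents."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("all parents must have the same length")
    return [rng.choice(parents)[i] for i in range(length)]

def multiple_crossovers(parents, n_crossovers, rng=random):
    """Apply the operator n_crossovers times to the same group of parents,
    yielding n_crossovers children."""
    return [uniform_scan_crossover(parents, rng) for _ in range(n_crossovers)]
```

With two parents and one crossover this reduces to a conventional single-child uniform crossover, which makes it easy to compare settings of the two parameters.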
Preserving locality for optimal parallelism in task
allocation.

Schoneveld, A.; De Ronde, J.F.; Sloot, P.M.A. (Dept.
High-Performance Computing and Networking.
International Conference and Exhibition. Proceedings
xxi+1066 pp. 13 refs.
ISBN: 3-540-62898-3
Conference Article 

Practical; Theoretical; Experimental 
English

Genetic algorithms have been applied to several combinatorial 
optimisation problems, including the well-known task allocation problem, 
originating from parallel computing. We introduce random task graphs as a 
model of applications which display irregular global communication 
patterns. Uniform crossover is the standard genetic 
recombination operator that is applied to solution-encoded 
chromosomes. However, application of a uniform crossover may 
heavily disrupt low-cost sub-solutions, or building blocks, of a 
chromosome. Therefore, we define a locality-preserving 
recombination operator, exploiting the connectivity of the task 
graph. Experiments show that this new operator increases the convergence 
rate of the genetic algorithm applied to the task allocation 
problem. 
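
To make the disruption argument above concrete, the following sketch pairs a standard uniform crossover with a simple communication cost. It assumes an allocation is a list mapping each task index to a processor and that cost is the number of task-graph edges crossing processors; this cost function is an illustrative assumption, and the locality-preserving operator itself is not reproduced because the abstract does not define it.

```python
import random

def uniform_crossover(a, b, rng=random):
    """Swap each gene between the two parents with probability 1/2."""
    child1, child2 = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < 0.5:
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2

def cut_edges(allocation, edges):
    """Count task-graph edges whose endpoints sit on different processors."""
    return sum(1 for u, v in edges if allocation[u] != allocation[v])
```

Because uniform crossover ignores which tasks communicate, children of two low-cost parents can cut many more edges than either parent, which is the effect a connectivity-aware operator tries to avoid.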



DOCUMENT TYPE: 
TREATMENT CODE 
COUNTRY: 
LANGUAGE : 
AB 



L105 ANSWER 15 OF 33 INSPEC (C) 2004 IEE on STN
ACCESSION NUMBER: 1995:5080342 INSPEC
DOCUMENT NUMBER: B9511-0260-031; C9511-1180-074
TITLE: The usefulness of recombination.
AUTHOR: Hordijk, W.; Manderick, B. (Santa Fe Inst., NM, USA)
SOURCE: Advances in Artificial Life. Third European Conference
Conference: Granada, Spain, 4-6 June 1995
ISBN: 3-540-59496-5
Conference Article
Theoretical
English

In this paper, we examine the usefulness of recombination from 
two points of view. First, the problem of crossover 
disruption is investigated. This is done by comparing two genetic 
algorithms with different crossover operators (one-point 
and uniform) to each other on NK-landscapes with different values of K 
relative to N, and with different epistatic interactions (random and 
nearest neighbor). Second, the usefulness of recombination in
relation to the location of local optima in the fitness landscape is 
investigated. There appears to be a clear relation between the type of 
fitness landscape and the type of recombination that is most 
useful on this landscape. Furthermore, there also is a clear relation 
between the location of local optima in the fitness landscape and the 
usefulness of recombination. 
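
For readers unfamiliar with NK-landscapes, a minimal generator might look as follows. It assumes binary genomes, takes the "nearest neighbor" interactions to be the next K loci with wrap-around, and draws contributions uniformly from [0, 1); these choices and the name `make_nk_landscape` are assumptions for illustration rather than the setup used in the paper.

```python
import random
from itertools import product

def make_nk_landscape(n, k, neighbours="nearest", seed=0):
    """Return a fitness function over bit lists of length n (requires k < n)."""
    rng = random.Random(seed)
    links = []
    for i in range(n):
        if neighbours == "nearest":
            others = [(i + j) % n for j in range(1, k + 1)]
        else:  # "random": k distinct other loci
            others = rng.sample([j for j in range(n) if j != i], k)
        links.append([i] + others)
    # One random contribution per locus and per setting of its k + 1 inputs.
    tables = [
        {bits: rng.random() for bits in product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def fitness(genome):
        parts = (tables[i][tuple(genome[j] for j in links[i])] for i in range(n))
        return sum(parts) / n  # mean of the n locus contributions

    return fitness
```

Raising K relative to N makes each locus depend on more of the genome, which is the ruggedness parameter the abstract varies when comparing one-point and uniform crossover.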
Van Kemenade, C.H.M. (CWI, Amsterdam, Netherlands);
Kok, J.N.; Eiben, A.E. 

1995 IEEE International Conference on Evolutionary 
Computation (Cat. No.95TH8099) 

New York, NY, USA: IEEE, 1995. p. 346-51 vol.1 of 2 
vol. xii+855 pp. 9 refs. 

Conference: Perth, WA, Australia, 29 Nov-1 Dec 1995
Sponsor(s): IEEE Neural Network Council
Price: CCCC 0 7803 2759 4/95/$4.00 
Theoretical
In many genetic algorithm applications the objective is to find
a (near) optimal solution using a limited amount of computation. Given
these requirements it is difficult to find a good balance between
exploration and exploitation. Usually such a balance is found by tuning
the various parameters (like the selective pressure, population size, the
mutation and crossover rate) of the genetic
algorithm. As an alternative we propose simultaneous tuning of the
selective pressure and the disruptiveness of the
recombination operators. Our experiments show that the combination
of a proper selective pressure and a highly disruptive
recombination operator yields superior performance. The reduction
mechanism used in a steady state GA has a strong influence on the optimal
crossover disruptiveness. Using the worst fitness
deletion strategy the building blocks present in the current best
individuals are always preserved. This releases the crossover
operator from the burden to maintain good building blocks and allows us to
tune crossover disruptiveness to improve the search
for better individuals.
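
The worst-fitness deletion strategy described above can be illustrated with a single steady-state step. This is a sketch under stated assumptions: parents are drawn uniformly at random (the paper tunes selective pressure, which is not modelled here), higher fitness is better, and `steady_state_step` and its `crossover(a, b, rng)` callback are hypothetical names.

```python
import random

def steady_state_step(population, fitness, crossover, rng=random):
    """One steady-state step with worst-fitness deletion.

    A single child is bred and added to the pool, then the least fit member
    of the pool (possibly the child itself) is removed, so the population
    size stays constant and the current best individuals are never lost.
    """
    a, b = rng.sample(population, 2)
    child = crossover(a, b, rng)
    pool = population + [child]
    worst = min(range(len(pool)), key=lambda i: fitness(pool[i]))
    del pool[worst]
    return pool
```

Since only the worst member is ever discarded, a very disruptive `crossover` can be plugged in without risking the loss of good building blocks already present in the best individuals.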
This paper describes an automated process for the dynamic creation of a
pattern-recognizing computer program consisting of initially unknown
detectors, an initially-unknown iterative calculation incorporating the
as-yet-uncreated detectors, and an initially-unspecified final calculation
incorporating the results of the as-yet-uncreated iteration. The program's
goal is to recognize a given protein segment as being a
transmembrane domain or non-transmembrane area. The recognizing program to
solve this problem will be evolved using the recently developed genetic
programming paradigm. Genetic programming starts with a primordial ooze of
randomly generated computer programs composed of available programmatic
ingredients and then genetically breeds the population using the Darwinian
principle of survival of the fittest and the genetic crossover
(sexual recombination) operation. Automatic function definition
enables genetic programming to dynamically create subroutines (detectors).
When cross-validated, the best genetically-evolved recognizer achieves an
out-of-sample correlation of 0.968 and an out-of-sample error rate of
1.6%. This error rate is better than that recently reported for five other
methods.
A formal analysis of the role of multi-point
crossover in genetic algorithms.

De Jong, K.A. (Dept. of Comput. Sci., George Mason
Univ., Fairfax, VA, USA); Spears, W.M.
Annals of Mathematics and Artificial Intelligence
(April 1992) vol.5, no.1, p. 1-26. 11 refs.
CODEN: AMAIEC ISSN: 1012-2443 
Journal 
Theoretical 
Switzerland 
English 

Extends existing theoretical results in an attempt to provide a broader 
explanatory and predictive theory of the role of multi-point 
crossover in genetic algorithms. In particular, the 
authors extend the traditional disruption analysis to include 
two general forms of multi-point crossover: n-point 
crossover and uniform crossover. The authors also 
analyze two other aspects of multi-point crossover operators, 
namely their recombination potential and exploratory power. The 
results of this analysis provide a much clearer view of the role of 
multi-point crossover in genetic algorithms. The
implications of these results on implementation issues and performance are
discussed, and several directions for further research are suggested.
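
As a concrete reference for the n-point operator analysed above, the sketch below cuts two equal-length parents at n distinct interior points and alternates the segments. It is a generic textbook form written for illustration; the function name and list representation are assumptions, not the authors' code.

```python
import random

def n_point_crossover(a, b, n, rng=random):
    """Cut both parents at n distinct interior points and alternate segments.

    Requires 1 <= n <= len(a) - 1 and len(a) == len(b).
    """
    points = sorted(rng.sample(range(1, len(a)), n))
    child1, child2 = [], []
    src1, src2 = a, b
    prev = 0
    for cut in points + [len(a)]:
        child1.extend(src1[prev:cut])
        child2.extend(src2[prev:cut])
        src1, src2 = src2, src1  # switch source parent after every cut
        prev = cut
    return child1, child2
```

Setting n = 1 gives the classical one-point operator, while uniform crossover corresponds to deciding independently at every position which parent to copy from.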
analysis of inbred families, the Friedreich ataxia (FRDA) locus was
localized in a 300-kb interval between the X104 gene and the
microsatellite marker FR8 (D9S888). By homology searches of the
sequence databases, we identified X104 as the human
tight junction protein ZO-2 gene. We generated a large-scale
physical map of the FRDA region by pulsed-field gel electrophoresis
analysis of genomic DNA and of three YAC clones derived from different
libraries, and we constructed an uninterrupted cosmid contig spanning the
FRDA locus. The cAMP-dependent protein kinase gamma-catalytic
subunit gene was identified within the critical FRDA interval, but it was
excluded as candidate because of its biological properties and because of
lack of mutations in FRDA patients. Six new polymorphic markers
were isolated between FR2 (D9S886) and FR8 (D9S888), which were used for
homozygosity analysis in a family in which parents of an affected child
are distantly related. An ancient recombination involving the
centromeric FRDA flanking markers had been previously demonstrated in
this family. Homozygosity analysis indicated that the FRDA gene is
localized in the telomeric 150 kb of the FR2-FR8 interval.
Comparative modelling of proteins is a predictive technique to
build an atomic model for a given amino acid sequence, on the 
basis of the structures of other proteins (templates) that have 
been determined experimentally. Critical problems arise in this procedure: 
selecting the correct templates, aligning the query sequence 
with them and building the non-conserved surface loops. In this work, we 
apply a genetic algorithm, with crossover and 
mutation, as a new tool to overcome the first two. In silico 
protein recombination proves to be an effective way to 
exploit the variability of templates and sequence alignments to 
produce populations of optimized models by artificial selection. Despite 
some limitations, the procedure is shown to be robust to alignment errors, 
while simplifying the task of selecting templates, making it a good 
candidate for automatic building of reliable protein models. 
AB Resolvases and DNA invertases catalyse site-specific recombination
by a concerted cut-and-religate mechanism. Topological data strongly
suggest a rotational movement of the DNA half-sites during 
recombination: in an "iterative" mode of reaction, after cleavage 
of all four strands of the two recombining sites, the recombinase-linked 
half-sites seem to rotate through multiple steps of 180 degree prior to 
final religation. However, current structural data provide no clear 
support for the postulated corresponding rotation of enzyme subunits 
within an active tetramer. A key issue is whether repetition of apparent 
180 degree rotation steps requires rejoining of the DNA strands and 
resetting of the catalytic machinery, or if multiple rotation steps can 
take place in the fully cleaved intermediate. We present evidence that a
resolvase-catalysed DNA knotting reaction, brought about by apparent 360 
degree rotation, can proceed without rejoining of the DNA strands in the 
recombinant (180 degree rotation) configuration. This behaviour is not 
compatible with a mechanism requiring a fixed arrangement of the catalytic 
subunits, and strongly suggests that recombination is 
coupled to disruption of the dimer interface between two 
subunits bound at each crossover site. We also show that an 
artificial supercoiled plasmid containing two res sites, with a single 
mismatched base-pair in one of the crossover sites, is a
substrate for "suicidal" reactions in which resolvase remains covalently
linked to two half-sites.
AB The construction of clean and unmarked mutations in bacteria,
where a gene is replaced by an in vitro-modified allele, is a fundamental
approach to the understanding of pathogenicity at a molecular level, the
definition of structure-function relationships, and the production of
vaccine candidates. With the increasing availability of complete bacterial
genome sequences, the potential for such mutagenesis
has grown exponentially. However, to date a great number of open reading
frames (ORFs) remain unannotated since they present no homology with
sequences already present in the databases. The precise
function of some of these unknown genes will probably be deduced through
in silico predictions, by comparing different genomes, or by the use of
modern genetic strategies such as serial analysis of gene expression, in
vivo expression technology, representational difference analysis, and
signature tagged mutagenesis that help in extraction of
information without a priori knowledge of the sequence.
Nevertheless, reverse genetic analysis, one of the logical approaches to
be undertaken, will be necessary to identify a phenotype and to attribute
a precise function to many undefined ORFs. Reverse genetics is a powerful
approach for the identification of gene function, in which the gene of
interest is mutated or inactivated to study the resulting
effects on the microorganism. Although allelic exchange is easy to perform
with many bacteria, it remains very difficult or impractical with others.
The classical method of using a suicide plasmid that is unable to
replicate in the studied strain to deliver an inactivated allele of the
gene in the chromosome is often not efficient because the frequency of
double crossover events may be low and because illegitimate
recombination may occur. Consequently, allelic exchange
mutants may represent only a small fraction of the transformants
and may be difficult to isolate. Counterselectable markers are often
instrumental for the construction of such mutants, especially in
microorganisms for which the genetic schemes are poorly developed. Under
appropriate growth conditions, a counterselectable gene promotes the death
of the microorganisms harboring it. Hence, transformants which have
integrated a suicide vector containing a counterselectable marker, either
by a single event of homologous or illegitimate recombination,
retain a copy of the counterselectable marker in the chromosome and are
therefore eliminated in the presence of the counterselective compound.
Consequently, counterselectable markers have been used for the positive
selection of mutants that have undergone defined genetic
alterations leading to the loss of the marker. In different studies,
applications such as the construction of mutants, the isolation
of insertion sequence (IS) elements, and the curing of plasmids
have been described. The most-used counterselectable markers are the genes
that confer sucrose, streptomycin, or fusaric acid sensitivity. They have
been used to construct mutants or vaccine strains in
Mycobacterium tuberculosis, Helicobacter pylori, Bordetella pertussis, and
many other bacteria. Here we provide a short review of the situations in
which the use of a counterselectable marker has proven to be particularly
advantageous.
AB RusA is a Holliday junction resolvase encoded by the cryptic prophage
DLP12 of Escherichia coli K-12 that can be activated to promote homologous
recombination and DNA repair in resolution-deficient
mutants lacking the RuvABC proteins. Database
searches with the 120 amino acid residue RusA sequence
identified 11 homologues from diverse species, including one from the
extreme thermophile Aquifex aeolicus, which suggests that RusA may be of
ancient bacterial ancestry. A multiple alignment of these
sequences revealed seven conserved or invariant acidic residues in
the C-terminal half of the E. coli protein. By making
site-directed mutations at these positions and analyzing the
ability of the mutant proteins to promote DNA repair
in vivo and to resolve junctions in vitro, we identified three aspartic
acid residues (D70, D72 and D91) that are essential for catalysis and that
provide the first insight into the active-site mechanism of junction
resolution by RusA. Substitution of any one of these three residues with
asparagine reduces resolution activity >80-fold. The mutant
proteins retain the ability to bind junction DNA regardless of the
DNA sequence or of the mobility of the crossover.
They interfere with the function of the RuvABC proteins in vivo,
when expressed from a multicopy plasmid, an effect that is reproducible in
vitro and that reflects the fact that the RusA proteins have a
higher affinity for junction DNA in the presence of Mg2+ than do the RuvA
and RuvC proteins. The D70N protein has a greater
affinity for junctions in Mg2+ than does the wild-type, which indicates
that the negatively charged carboxyl group of the aspartate residue plays
a critical role at the active site of RusA. Electrostatic repulsions
between D70, D72 and D91 may help to form a classical Mg2+-binding pocket.
AB Shewanella oneidensis MR-1 is an obligatorily respiratory organism capable
of growth with either oxygen or any of at least 12 other alternate
electron acceptors in the absence of oxygen. We have identified, cloned
and mutated the gene encoding ArcA, one of the primary
regulators of the oxic to anoxic growth transition in Escherichia coli,
from S. oneidensis MR-1. ArcA serves primarily as a negative regulator of
aerobic metabolism in E. coli but also is involved in positive regulation
of genes expressed under microaerobic conditions and some anaerobically
expressed genes. The E. coli ArcA protein sequence
was used to search the unfinished S. oneidensis MR-1 genome
sequence database available from TIGR. The MR-1 ArcA
protein was 72-81% identical to homologs from E. coli, V. cholerae
and H. influenzae, including the conserved phosphate-accepting aspartate
residue at amino acid 54 in E. coli. The MR-1 arcA DNA sequence
revealed that open reading frames both upstream and downstream of arcA are
transcribed in the opposite direction from arcA, indicating that it is not
part of an operon. The arcA gene is also followed by sequences
typical of Rho-independent transcriptional terminators. No obvious
binding sites for either Fnr (TTGAT-ATCAA) or ArcA ((A/T)GTTAATTA(A/T))
proteins were identified upstream of the MR-1 arcA gene but the
gene was preceded by a tandem repeat of TGGTTA(G/A)AAT(A/T)T. No
significant homolog of the ArcB protein has been detected in the
MR-1 genome. The MR-1 arcA gene was cloned and mutated using
crossover PCR to delete all except the first and last seven codons
of the gene, separated by a 21 bp linker region encoding a unique SwaI
restriction site. This mutated gene, and a version containing a
kanamycin resistance cassette cloned into the SwaI site, was introduced
into the MR-1 genome by homologous recombination to replace the
wild-type allele. These mutants are being analyzed for the
effects of the arcA mutations on aerobic, microaerobic, and
anaerobic growth of MR-1, as well as synthesis of enzymes such as pyruvate
dehydrogenase and 2-oxoglutarate dehydrogenase that are known to be
repressed during anaerobic growth of MR-1.
method for the elucidation of the structure of the natural heptapeptide
opioid .mu.-selective dermorphin. The molecule was represented by its 31
internal co-ordinates (torsion angles), which had to be optimized so as to
minimize atomic overlap and distance-restraint violations. In this
instance the crossover operator of the GA was not able to
contribute significantly to its performance, which was therefore dependent
mostly on selection and point mutation. As a result, the more
sophisticated mutation/selection scheme imposed by simulated
annealing outperformed the GA in structure elucidation. The GA found
conformations of similar quality to those found by simulated annealing,
but took three times as long to converge on the solutions. The theories of
GA and simulated annealing are briefly described and the results are
discussed.
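
The mutation/selection scheme of simulated annealing mentioned above can be sketched as follows. This is a generic Metropolis-style loop with a geometric cooling schedule, written under the assumption that `energy` scores a conformation (for instance a list of torsion angles) and `mutate` perturbs it; the schedule, parameter values and names are illustrative and are not taken from the paper.

```python
import math
import random

def anneal(energy, mutate, state, t_start=1.0, t_end=1e-3, steps=10000, rng=random):
    """Minimise energy(state) by simulated annealing.

    mutate(state, rng) must return a new candidate without modifying state.
    Returns (best_state, best_energy).
    """
    best = current = state
    e_best = e_cur = energy(state)
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        cand = mutate(current, rng)
        e_cand = energy(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if e_cand <= e_cur or rng.random() < math.exp((e_cur - e_cand) / t):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best
```

A GA restricted to selection and point mutation explores in a similar way but without the temperature-controlled acceptance rule, which is one plausible reading of why annealing converged faster in the study.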
NOVELTY - A device or integrated system (I), comprising:

(1) a physical or logical array of reaction mixtures (II), each 
reaction mixture comprising shuffled or mutagenized nucleic 
acids (SNAs or MNAs) or transcribed SNAs or MNAs; and 
in vitro translation reagents (IVTLR), is new. 

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the 
following: 

(1) a diversity generation device (III) comprising a programmed 
thermocycler and a fragmentation module coupled to the thermocycler;

(2) a diversity generation kit (IV) comprising (III) and one or more
reagents for diversity generation; 

(3) a method (M1) of processing (SNAs) or (MNAs) comprising:

(a) providing (II), in which a number of reaction mixtures (RM) 
comprise members of a first population of nucleic acids 

(Nal) comprising (SNAs), or transcribed (SNAs), or (MNAs) or transcribed 
(MNAs), where many of the (RM) further comprise an in vitro translation 
reactant (IVTLR); and 

(b) detecting in vitro translation products by the members of (II); 

(4) the physical or logical array of reaction mixtures (II) produced
by (M1);

(5) a method (M2) of recombining members of a physical or logical 
array of nucleic acids (V) comprising: 

(a) providing (Nal); or 

(b) providing a data structure comprising 
character strings corresponding to (Nal); 

(c) recombining members of (Nal), providing a first population of
recombinant nucleic acids (rNal); or

(d) recombining the character strings corresponding to members of 
(Nal), providing a population of character strings corresponding to 
(rNal), and into (rNal); 

(e) spatially or logically separating members of (rNal) to produce
(V) and amplifying the recombinant nucleic acids in
(V) in vitro; or

in vitro amplifying members of (rNal) and physically or logically
separating them to produce an amplified (V);

(6) a method (M3) of recombining members of (V) comprising: 

(a) providing (Nal) arranged in a physical or logical array; 

(b) recombining members of (Nal) with additional nucleic
acids, providing (rNal);

(c) amplifying (rNal) in the physical or logical array; and 

(d) screening the first or amplified (rNal), or a duplicate, for a 
desired property; 

(7) a method (M4) of detecting or enriching for in vitro 
transcription or translation products comprising: 

(a) localizing first nucleic acids which encode 

moieties proximal to moiety recognition agents which specifically bind to 
them; 

in vitro translating or transcribing the nucleic 
acids, producing moieties which diffuse or flow into contact with 
the recognition agents; and 

(b) permitting binding of the moieties to the recognition agents, and 
detecting or enriching for moieties by detecting or collecting material 
proximal to, within or contiguous with the moiety recognition agent; 

(8) a solid substrate (VI) made by (M4); 

(9) a method (M5) of producing duplicate arrays of (SNAs) or (MNAs)
comprising:

(a) providing (II); and 

(b) forming a duplicate array of copies of (II) by physically or 
logically organizing the copies into a physical or logical array; 

(10) the physical or logical array and duplicate array (VII) produced 
by (M5); 

(11) a method (M6) of normalizing an array of reaction mixtures 
comprising: 

in vitro transcribing or translating (II) to produce an array of 
products; and 

(a) determining a correction factor which accounts for variation in 
concentration of the products at different sites in the array of products; 

(12) the physical or logical array of (SNAs) or (MNAs) or transcribed 
(SNAs) or (MNAs), the array of products and the secondary array (VIII) 
produced by (M6);

(13) a method (M7) of recombining one or more nucleic acids 
comprising: 

(a) immobilizing one or more template nucleic acids on a solid 
support; 

(b) annealing overlapping complementary nucleic acid fragments to the 
immobilized template nucleic acid; 

(c) extending or ligating the annealed fragments to produce at least 
one heteroduplex, which comprises a template nucleic acid and a 
substantially full-length heterolog complementary to the template nucleic 
acid; and 

(d) recovering at least one substantially full-length heterolog; 

(14) a full-length heterolog (IX) produced by (M7);

(15) an array (X) comprising heteroduplexes or full-length heterologs 
produced by (M7); 

(16) an amplified heterolog (XI) produced by (M7);

(17) a vector (XII) produced by (M7); 

(18) a cell (XIII) produced by (M7); 

(19) a library of diversified heterologs (XIV) produced by (M7); 

(20) an integrated system (XV) comprising an array which comprises 
heteroduplexes or full-length heterologs produced by (M7); 

(21) a method (M8) of directing nucleic acid fragmentation using a 
computer, the method comprising calculating the ratio of uracil to 
thymidine which may then be used in a fragmentation module to produce one 
or more nucleic acid fragments of a selected length; 

(22) a method (M9) of directing polymerase chain reaction (PCR) using 
a computer, the method comprising calculating one or more crossover 
regions between two or more parental nucleic acid sequences using one or 
more annealing temperature or extension temperature;

(23) a method (M10) of selecting one or more parental nucleic acids 
for diversity generation using a computer comprising: 

(a) performing an alignment between two or more potential parental
nucleic acid sequences;

(b) calculating a number of mismatches between alignment; 

(c) calculating a melting temperature for one or more window of w
bases in the alignment; 

(d) identifying one or more window of w bases having a melting 
temperature greater than x; 

(e) identifying one or more crossover segment in the alignment which 
comprises two or more windows having a melting temperature greater than x
and that are separated by no more than n nucleotides; 

(f) calculating a dispersion of the one or more crossover segments; 

(g) calculating a first score for each alignment based on the number 
of windows having a melting temperature greater than x, the dispersion, 
and the number of crossover segments identified; 

(h) calculating a second score based on the number of mismatches, the 
number of windows having a melting temperature greater than x, the 
dispersion, and the number of crossover segments identified; and 

(i) selecting one or more parental nucleic acid based on the first 
score and/or the second score; and 

(24) a web page (XVI) for directing nucleic acid diversity generation 
comprising a computer readable medium that causes a computer to perform 
(M8), (M9) or (M10).
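
Steps (c) to (e) of method (M10) can be sketched as below. The record does not say how melting temperature is calculated or how a window relates to the alignment, so this sketch makes two loud assumptions: it uses the Wallace rule of thumb (2 degrees C per A or T, 4 degrees C per G or C), and it only counts windows in which two pre-aligned, gap-free, equal-length parents are identical. All function names are hypothetical.

```python
def wallace_tm(seq):
    """Rule-of-thumb melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    s = seq.upper()
    return 2 * (s.count("A") + s.count("T")) + 4 * (s.count("G") + s.count("C"))

def hot_windows(seq_a, seq_b, w, x):
    """Start positions of w-base windows where both aligned parents are
    identical and the window's melting temperature exceeds x."""
    starts = []
    for i in range(len(seq_a) - w + 1):
        win = seq_a[i:i + w]
        if win == seq_b[i:i + w] and wallace_tm(win) > x:
            starts.append(i)
    return starts

def crossover_segments(starts, w, n):
    """Group windows separated by at most n nucleotides and keep groups with
    two or more windows. Returns (segment_start, segment_end) pairs, with
    segment_end exclusive."""
    segments, group = [], []
    for s in starts:
        if group and s - (group[-1] + w) > n:
            if len(group) >= 2:
                segments.append((group[0], group[-1] + w))
            group = []
        group.append(s)
    if len(group) >= 2:
        segments.append((group[0], group[-1] + w))
    return segments
```

The number of segments and of qualifying windows returned here are the kinds of quantities that steps (f) to (h) would combine into scores; the scoring formulas themselves are not given in the record and are therefore not sketched.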

USE - (I) is useful for performing nucleic acid recombination, 
mutation, shuffling, and other diversity generating reactions in vitro to 
generate diverse nucleic acids and screen for desirable properties of 
those nucleic acids such as their products, e.g., encoded RNAs (catalytic 
RNAs, ribozymes) or proteins. 

ADVANTAGE - Each aspect of diversity generation and downstream 
screening processes can be automated and used individually in separate 
modules or collectively in an integrated system or device. 
Dwg.0/33 
NOVELTY - Preparing recombinant nucleic acid (I) from
oligonucleotides which correspond to a set of character string 
subsequences (SCSS) comprising at least two parental character strings 
(PCS) corresponding to a number of nucleic acids, is 
new. 

DETAILED DESCRIPTION - Preparing recombinant nucleic 
acid (I) by aligning for maximum identity a number of parental 
character strings (PCS) corresponding to a number of nucleic 
acids, defining a set of character string subsequences (SCSS) 
comprising at least two of the PCS, providing a set of 
oligonucleotides corresponding to the SCSS and then annealing and 
elongating one or more oligonucleotides with polymerase or 
ligating at least two with ligase to produce (I).

INDEPENDENT CLAIMS are also included for the following: 

(1) preparing character strings (CS) by providing PCS encoding a
polynucleotide or polypeptide, providing a set of
oligonucleotide character strings of preselected length that
encode a number of single-stranded oligonucleotide
sequence comprising sequence fragments of PCS and its
complement and creating a set of derivatives of parental sequence
comprising sequence variant strings, a set of multiple
mutations with one mutation per variant string;

(2) a library prepared by the above said method; 

(3) facilitating recombination between two or more
divergent nucleic acids by aligning PCS corresponding
to divergent nucleic acids, identifying regions of
sequence identity and regions of sequence diversity,
defining a diplomat CS which is intermediate in PCS, synthesizing at least
a portion of the diplomat sequence to produce a diplomat
nucleic acid and recombining a mixture of parental
nucleic acid and diplomat nucleic acid;

(4) a mixture of selected nucleic acids produced 
by the above said method; 
(5) generating and recombining nucleic acids by
inputting a number of amino acid sequence character strings
(ASCS) into a digital system, reverse translating ASCS in the digital
system to a number of nucleic acid character strings
which are species codon biased in a selected expression host and with
optimized sequence similarity between a number of
nucleic acid character strings and synthesizing one or
more oligonucleotides from one or more reverse translated
nucleic acid sequences;

(6) optimizing activity of a nucleic acid by
parameterizing a number of nucleic acids or
proteins to provide a set of multidimensional datapoints,
extrapolating one or more postulated multidimensional datapoint from the
set of multidimensional datapoints and converting the postulated
multidimensional datapoint to a new CS corresponding to a postulated
nucleic acid or protein;

(7) providing a library of recombinant nucleic
acids which is enriched for a sequence of interest and
selecting the library by producing an initial library of at least about
10^6 recombinant nucleic acids, comprising at least
about 10^5 different non-identical units, hybridizing the library to one or
more population of nucleic acids that correspond to
one or more subsequences in the different library units;

(8) the enriched library produced by the above said method; 

(9) generating a library of biological polymers by 
generating a diverse population of CS in a computer, which in turn are 
generated by alteration of pre-existing CS, synthesizing the diverse 
population of CS in which diverse population comprises the library of 
biological polymers; and 

(10) an integrated system comprising a computer having a first data 
set comprising a first CS, a second data set comprising a second CS, 
software for aligning the first and second CSs, software for performing a 
genetic operation on the first or second CS, an output file comprising a 
third data set comprising a third CS, the third CS comprising CS 
subsequences from the first and second CSs, and an oligonucleotide 
sequence output file comprising a plurality of overlapping 
oligonucleotide sequences corresponding to the third CS. 
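
Items (5) and (10) describe reverse translating an amino acid character string with a host codon bias and writing out overlapping oligonucleotide sequences. A minimal Python sketch of that idea follows. The codon table is a hypothetical five-entry illustration, not measured codon usage for any host, and the function names and the single-strand tiling are assumptions made for this example; the record does not specify an implementation.

```python
# Hypothetical preferred-codon table (illustration only, not real usage data).
PREFERRED_CODONS = {"M": "ATG", "K": "AAA", "L": "CTG", "A": "GCG", "W": "TGG"}


def reverse_translate(protein, preferred):
    """Map each amino acid letter to the single preferred codon for the host."""
    return "".join(preferred[aa] for aa in protein)


def overlapping_oligos(seq, length, overlap):
    """Tile seq with fragments of the given length that share `overlap` bases.

    Simplification: all fragments are taken from one strand; a real assembly
    design would usually alternate strands and balance melting temperatures.
    """
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than length")
    oligos = []
    for start in range(0, len(seq), step):
        oligos.append(seq[start:start + length])
        if start + length >= len(seq):
            break  # the last fragment already reaches the end of the sequence
    return oligos


dna = reverse_translate("MKLAW", PREFERRED_CODONS)
print(dna)                             # ATGAAACTGGCGTGG
print(overlapping_oligos(dna, 9, 3))   # ['ATGAAACTG', 'CTGGCGTGG']
```

Picking one codon per residue is the simplest form of codon biasing; optimizing sequence similarity between several reverse-translated strings, as item (5) also requires, would need a choice among synonymous codons and is not attempted here.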

USE - The method is useful for rapid evolution of nucleic 
acids in vitro and in vivo and provides for generation of encoded 
molecules with new and/or improved properties. Proteins and 
nucleic acids of industrial, agricultural and 
therapeutic importance can be created or improved through DNA shuffling 
procedures. 

ADVANTAGE - Physical access to genes or organisms is not required as 
sequence information is used for design and selection of oligo. 
Extensive sequence information is provided and sequences 
from inaccessible, non-cultivable organisms can also be used. 
Sequences from pathogens, obtained without actual handling of the pathogens, 
and all types of sequences, including damaged and incomplete genes, are 
amenable to this method. All genetic operators and crossovers can be fully 
and independently controlled in a reproducible fashion removing human 
error and variability from physical experiments with DNA manipulations. 
Sequences with frame-shift mutations are eliminated or 
fixed. Wild type parents do not contaminate derivative libraries with 
multiple redundant parental molecules. 
Dwg.0/15 

*ABSTRACT IS AVAILABLE IN THE ALL AND IALL FORMATS* 
Arabidopsis has emerged as an important model for the analysis of 
meiosis in Angiosperm plants, creating an interesting and useful parallel 
to other model organisms. This development has been underpinned by 
advances in the molecular biology and genetics of Arabidopsis, especially 
the determination of its entire genome sequence. However, these 
advances alone would have been insufficient without the development of 
improved methods for cytological analysis and cytogenetic investigation of 
meiotic nuclei and chromosomes. A basic descriptive framework of meiosis 
in Arabidopsis has been established based on these procedures. In 
addition, molecular cytogenetic and immunocytological techniques have 
provided supplementary detailed information on some aspects of meiosis. 
Gene identification and characterization have proceeded in parallel with 
these developments based on both forward and reverse genetic procedures 
utilising the considerable range of Arabidopsis genetic and molecular 
resources, such as T-DNA and transposon tagged lines as well as the 
genomic DNA database, in combination with cytological analysis. 
A diverse range of meiotic genes have been identified and analysed by 
these procedures and in selected cases they have been subjected to 
detailed functional analysis. This review focuses on genes that are 
involved in the key meiotic events of chromosome synapsis and 
recombination. 

*ABSTRACT IS AVAILABLE IN THE ALL AND IALL FORMATS* 
Bacteroides conjugative transposons (CTns) are thought to transfer by 
first excising themselves from the chromosome to form a nonreplicating 
circle, which is then transferred by conjugation to a recipient. Earlier 
studies showed that transfer of most Bacteroides CTns is stimulated by 
tetracycline, but it was not known which step in transfer is regulated. We 
have cloned and sequenced both ends of the Bacteroides CTn, CTnDOT, and 
have used this information to examine excision and integration events. A 
segment of DNA that contains the joined ends of CTnDOT and an adjacent 
open reading frame (ORF), intDOT, was necessary and sufficient for 
integration into the Bacteroides chromosome. Integration of this miniature 
form of the CTn was not regulated by tetracycline. Excision of CTnDOT and 
formation of the circular intermediate were detected by PCR, using primers 
designed from the end sequences. Sequence analysis of the PCR products 
revealed that excision and integration involve a 5-bp coupling 
sequence-type mechanism possibly similar to that used by CTn Tn916, a CTn 
found originally in enterococci. PCR analysis also demonstrated that 
excision is a tetracycline-regulated step in transfer. The integrated 
minielement containing intDOT and the ends of CTnDOT did not excise, nor 
did a larger minielement that also contained an ORF located immediately 
downstream of intDOT designated orf2. Thus, excision involves other genes 
besides intDOT and orf2. Both intDOT and orf2 were disrupted by 
single-crossover insertions. Analysis of the disruption 
mutants showed that intDOT was essential for excision but orf2 was not. 
Despite its proximity to the integrase gene, orf2 appears not to be 
essential for excision. 

*ABSTRACT IS AVAILABLE IN THE ALL AND IALL FORMATS* 
In mammalian cells, repair of DNA double-strand breaks (DSBs) occurs by 
both homologous and nonhomologous mechanisms. By definition, homologous 
recombination requires a template with sufficient sequence 
identity to the damaged molecule in order to direct repair. We now show 
that the sister chromatid acts as a repair template in a substantial 
proportion of DSB repair events. The outcome of sister chromatid repair is 
primarily gene conversion unassociated with reciprocal exchange. This 
contrasts with expectations from the classical DSB repair model originally 
proposed for yeast meiotic recombination, but is consistent with 
models in which recombination is coupled intimately 
with replication. These results may explain why cytologically observable 
sister chromatid exchanges are induced only weakly by DNA-damaging agents 
that cause strand breaks, since most homologous repair events would not be 
observed. A preference for non-crossover events between sister 
chromatids suggests that crossovers, although genetically silent, may be 
disfavored for other reasons. Possibly, a general bias against 
crossing over in mitotic cells exists to reduce the 
potential for genome alterations when other homologous repair templates 
are utilized. 

Searched by Barb O'Bryen, STIC 308-4291 

Zhou 09/863765 Page 38 

L105 ANSWER 31 OF 33 SCISEARCH COPYRIGHT 2004 THOMSON ISI on STN 
ACCESSION NUMBER: 1998:53584 SCISEARCH 
THE GENUINE ARTICLE: YP769 
TITLE: The nature of polymorphism of the HLA class I non-coding 
regions and their contribution to the diversification of 
HLA 
AUTHOR: Blasczyk R (Reprint); Kotsch K; Wehling J 
CORPORATE SOURCE: HUMBOLDT UNIV BERLIN, DEPT INTERNAL MED, DIV HEMATOL & 
ONCOL, BLOODBANK, VIRCHOW KLINIKUM, D-13353 BERLIN, 
GERMANY (Reprint) 
COUNTRY OF AUTHOR: GERMANY 
SOURCE: HEREDITAS, (OCT 1997) Vol. 127, No. 1-2, pp. 7-9. 
Publisher: HEREDITAS-DISTRIBUTION, GJORLOFFSGATAN 121, 261 
34 LANDSKRONA, SWEDEN. 
ISSN: 0018-0661. 
DOCUMENT TYPE: Article; Journal 
FILE SEGMENT: LIFE 
LANGUAGE: English 
REFERENCE COUNT: 10 

*ABSTRACT IS AVAILABLE IN THE ALL AND IALL FORMATS* 
AB The sequence database of HLA class I genes is 
mainly derived from mRNA analysis. Little is known about the non-coding 
sequences of the different class I alleles. In this study we have 
determined the sequence of the 1st through 3rd introns of the 
majority of HLA-A and -B alleles. The few published sequences 
proved to contain substantial errors. The introns turned out to be highly 
polymorphic with a variability of 14.6% in the 1st intron decreasing to 
6.2% in the 3rd intron. Against all expectations, this variability is not 
characterised by random point mutations but by a highly 
systematic diversity reflecting the ancestral relationship of the HLA 
alleles. The variability is arrested on the level of the serological 
diversity. The striking conservation within each ancestral lineage 
suggests that point mutations have been negatively selected. 
This finding could be explained by the evolutionary pressure on base 
order, promoting the potential to extrude single-strand stem-loops from 
supercoiled duplex DNA, which is believed to be important for 
recombination. Moreover, the GC content was found to be as high as 
78% in the 1st and 2nd introns and 55% in the 3rd intron. These CpG 
islands are directly involved in the exchange of short stretches of DNA in 
unequal crossing-over events. Additionally, conversion 
between different class I sequences is facilitated by regions of 
strong homology, stabilizing the pairing of variable regions. All these 
observations indicate the potential of a substantial contribution of 
introns to the recombinational activity of class I genes. The exclusive 
clustering of CpG islands in the 1st and 2nd introns restricts the gene 
conversion events to the regions of the 2nd and 3rd exons and therefore 
protects the conservation of the 5' flanking region and the 3' part of the 
gene. Since there are fewer diversification forces acting on introns they 
may be more conserved in a trans-species manner than exons. Therefore, 
they could provide the answer for the controversy regarding intra- or 
trans-species evolution. 

*ABSTRACT IS AVAILABLE IN THE ALL AND IALL FORMATS* 

AB We present data for the initial construction of a molecular linkage map 
for the domesticated silkworm, Bombyx mori, based on 52 progeny from an F2 
cross from a pair mating of inbred strains p50 and C108, using restriction 
fragment length polymorphisms (RFLPs). The map contains 15 characterized 
single copy sequences, 36 anonymous sequences derived 
from a follicular cDNA library, and 10 loci corresponding to a low copy 
number retrotransposon, mag. The 15 linkage groups and 8 ungrouped loci 
account for 23 of the 28 chromosomes and span a total 
recombination length of 413 cM; 10 linkage groups were correlated 
with established classic genetic maps. Scoring data from Southern blots 
were analysed using two Pascal programs written specifically to analyse 
linkage data in Lepidoptera, where females are the heterogametic sex and 
have achiasmatic meiosis (no crossing-over). These 
first examine evidence for linkage by calculating the maximum lod score 
under the hypothesis that the two loci are linked over the likelihood 
under the hypothesis that the two loci assort independently, and then 
determine multilocus linkage maps for groups of putatively syntenic loci 
by calculating the maximum likelihood estimate of the 
recombination fractions and the log likelihood using the EM 
algorithm for a specified order of loci along the chromosome. In 
addition, the possibility of spurious linkage was exhaustively tested by 
searching for genotypes forbidden by the absence of crossing-over 
in one sex. 
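
The two-point linkage test described in this abstract compares the likelihood under linkage with the likelihood under independent assortment. A minimal Python sketch of that comparison is given here under a strong simplifying assumption: phase-known informative meioses that can each be counted as recombinant or non-recombinant. The function name and the counting model are assumptions for illustration; the Pascal programs mentioned in the abstract handle F2 data and use the EM algorithm, neither of which is reproduced.

```python
import math


def lod_score(recombinants, meioses):
    """Return (theta_hat, lod) for a simple two-point linkage test.

    theta_hat is the maximum likelihood recombination fraction, capped at 0.5
    (independent assortment). lod is log10 of the likelihood at theta_hat
    divided by the likelihood at theta = 0.5.
    """
    if meioses <= 0 or not 0 <= recombinants <= meioses:
        raise ValueError("need 0 <= recombinants <= meioses and meioses > 0")
    nonrecombinants = meioses - recombinants
    theta_hat = min(recombinants / meioses, 0.5)

    def log10_likelihood(theta):
        # Binomial likelihood theta^r * (1 - theta)^(n - r), in log10 form.
        # Terms with a zero count are skipped so that log10(0) is never taken.
        total = 0.0
        if recombinants:
            total += recombinants * math.log10(theta)
        if nonrecombinants:
            total += nonrecombinants * math.log10(1.0 - theta)
        return total

    return theta_hat, log10_likelihood(theta_hat) - log10_likelihood(0.5)


theta, lod = lod_score(2, 20)
print(f"theta_hat={theta:.2f}, lod={lod:.2f}")
```

With no recombinants among n meioses the score reduces to n times log10(2), which is why complete linkage in the achiasmatic sex is informative even in small families.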

*ABSTRACT IS AVAILABLE IN THE ALL AND IALL FORMATS* 

AB A methodology was developed to construct any desired chromosomal 
mutation in the gene cluster that encodes the actinorhodin polyketide 
synthase (PKS) of Streptomyces coelicolor A3(2). A positive selection 
marker (resistance gene) is first introduced by double crossing-over 
into the chromosomal site of interest by use of an unstable 
delivery plasmid. This marker is subsequently replaced by the desired 
mutant allele via a second high-frequency double recombination 
event. The technology has been used to: (i) explore the significance of 
translational coupling between two adjacent PKS genes; (ii) 
prove that the acyl carrier protein (ACP) encoded by a gene in the cluster 
is necessary for the function of the actinorhodin PKS; (iii) provide 
genetic evidence supporting the hypothesis that serine 42 is the site of 
phosphopantetheinylation in the ACP of the actinorhodin PKS; and (iv) 
demonstrate that this ACP can be replaced by a Saccharopolyspora fatty 
acid synthase ACP to generate an active hybrid PKS. 